Abstract

In distributed storage systems that employ erasure coding, the issue of minimizing the total communication required to exactly 
rebuild a storage node after a failure arises. This repair bandwidth depends on the structure of the storage code and the repair 
strategies used to restore the lost data. Designing high-rate maximum-distance separable (MDS) codes that achieve the optimum 
repair communication has been a well-known open problem. In this work, we use Hadamard matrices to construct the first explicit 
2-parity MDS storage code with optimal repair properties for all single node failures, including the parities. Our construction relies 
on a novel method of achieving perfect interference alignment over finite fields with a finite file size, or number of extensions. We 
generalize this construction to design m-parity MDS codes that achieve the optimum repair communication for single systematic 
node failures and show that there is an interesting connection between our m-parity codes and the systematic-repair optimal
permutation-matrix based codes of Tamo et al. [21] and Cadambe et al. [22], [23].

I. Introduction 

Distributed storage systems have reached such a massive scale that recovery from failures is now part of regular operation
rather than a rare exception [4]. Large scale deployments typically need to tolerate multiple failures, both for high availability
and to prevent data loss. Erasure coded storage achieves high failure tolerance without requiring a large number of replicas that
increase the storage cost. Three application contexts where erasure coding techniques are currently being deployed or are under
investigation are Cloud storage systems [5], archival storage, and peer-to-peer storage systems like Cleversafe and Wuala.

One central problem in erasure coded distributed storage systems is that of maintaining an encoded representation when 
failures occur. To maintain the same redundancy when a storage node leaves the system, a newcomer node has to join the
array, access some existing nodes, and exactly reproduce the contents of the departed node. In its most general form this 
problem is known as the Exact Code Repair Problem 1^, There are several metrics that can be optimized during repair: 
the total information read from existing disks during repair |[8|, |[9|, the total information communicated in the network (repair 
bandwidth |2|), or the total number of disks required for each repair Pl, |T0| |, l^l], |23|. 

Currently, the most well-understood metric is that of repair bandwidth. For designing (n, k) MDS erasure codes that have
n storage nodes and can tolerate any n − k failures, the information theoretic cut-set bounds for repair communication were
specified in [2 J and shown to be achievable for all values of n, /c in a series of recent papers jSj, fTSJ , [ ,16J , |T8|-pQ|. In 
particular, it was shown that for an (n, k) code, if a single node fails, downloading a 1/(n − k) fraction of every surviving disk is
sufficient and optimal in terms of repair bandwidth for the repair of a failed node. Beyond MDS codes, [2] demonstrated a
tradeoff between storage and repair communication, and code constructions for other points of this tradeoff are under active
investigation, see e.g. [20], or [24] for multiple node repair schemes. On this tradeoff, the minimum storage point is
achieved by MDS erasure codes with optimal repair, also known as Minimum Storage Regenerating (MSR) codes.

For code rates k/n ≤ 1/2, explicit MSR codes were designed by Shah et al. [16], Rashmi et al. [20], and Suh et al. [15]. For
the high-rate regime, however, the only known complete constructions [18], [19] require large file sizes (symbol extensions)
and field sizes. These constructions use the symbol extension interference alignment (IA) technique of [11] to establish that
there exist MDS storage codes that come arbitrarily close to (but do not exactly match) the information theoretic lower bound
for the repair bandwidth for all n, k. These asymptotic constructions are impractical due to the arbitrarily large finite field size
and the fast growing file size, required even for small values of n and k.

Our Contribution: We introduce the first explicit high-rate (k + 2, k) MDS storage code with optimal repair communication.
Our storage code exploits fundamental properties of Hadamard designs and perfect IA instances that can be understood through
the use of a lattice representation of the symbol extension technique of Cadambe et al. [11], [18], [19]. Our coding and repair
strategy bears resemblance to the notion of ergodic interference alignment [25], which is a finite-symbol-extension based IA
scheme for wireless channels.

Independently of this work, there has recently been substantial progress in designing high-rate explicit MSR codes. Tamo
et al. [21] and Cadambe et al. [23] designed MDS codes for any (n, k) parameters that have optimal repair for the systematic
nodes, but not the code parities. It seems that extending these designs to allow optimal parity repair is not straightforward.
The advantage of our work is that all n nodes are optimally repaired and the disadvantage is that our construction is currently
only optimal for n − k = 2.

Our key technical contribution is a scheme that achieves perfect interference alignment with a finite number of extensions.
This was developed in [26] and used in a 2-parity storage code with optimal repair for k nodes and near optimal repair of 2
nodes, that can handle any single node failure. We use a combinatorial view of different interference alignment schemes using
a framework we call dots-on-a-lattice. Hadamard matrices are shown to be crucial in achieving finite perfect alignment and
ensuring the full rank of desired subspaces.

Finally, we present m-parity MDS code constructions based on Hadamard designs that achieve optimal repair for systematic
node failures, but suboptimal repair for parity nodes. We show that these codes are equivalent to codes that involve permutation
matrices in the manner of [21] and [23] under a similarity transformation.

2-parity Code Parameters: Assuming that the file to be encoded has size $M = k2^{k+1}$, each of the k + 2 storage nodes
stores a coded block of size $M/k = 2^{k+1}$. Repairing a single node failure costs $(k+1)2^k$ in repair communication bandwidth, matching
the theoretic lower bound. Finally, we give explicit conditions on the MDS property of the code and show that finite fields of
size greater than or equal to 2k + 3 suffice to satisfy them.

m-parity Code Parameters: For file sizes $M = km^k$, our (k + m, k) codes achieve a repair communication bandwidth of
$(k+m-1)m^{k-1}$ for single systematic node failures, matching the information theoretic lower bound. The MDS property of these
codes is shown to hold for arbitrarily large finite fields with high probability.
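
As a quick sanity check on these parameters, the following sketch tabulates block size, file size, and repair download for small k. It assumes the cut-set accounting described in the Introduction (a 1/(n − k) fraction downloaded from each of the n − 1 surviving nodes); the function and key names are illustrative and do not come from the paper.

```python
def two_parity_parameters(k):
    """Parameters of the (k + 2, k) code as stated in the text."""
    N = 2 ** (k + 1)                           # block stored per node
    return {
        "n": k + 2,
        "file_size": k * N,                    # M = k * 2^(k+1)
        "block_size": N,
        "repair_download": (k + 1) * N // 2,   # N/2 from each of the k+1 survivors
        "naive_download": k * N,               # read k full blocks and re-encode
        "min_field_size": 2 * k + 3,           # field size stated as sufficient
    }


def m_parity_systematic_repair(k, m):
    """Cut-set download for one systematic node of a (k + m, k) code.

    Assumes block size N = m^k, i.e. file size M = k * m^k.
    """
    N = m ** k
    return (k + m - 1) * N // m                # 1/m of each of the k+m-1 survivors


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, two_parity_parameters(k))
    print(m_parity_systematic_repair(3, 3))
```

For k = 3, for instance, the optimal repair downloads 32 symbols, against 48 for the naive approach of reading k whole blocks.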

II. MDS Storage Codes with 2 Parity Nodes 

In this section, we consider the code repair problem for MDS storage codes with 2 parity nodes. After we lay down the
model for repair, we continue with introducing our code construction. Let a file of size M = kN, denoted by the vector
$f \in \mathbb{F}_q^{kN}$, be partitioned in k parts $f = [f_1^T \dots f_k^T]^T$, each of size N, where N denotes the subpacketization factor.
We wish to store f across k systematic and 2 parity storage units, each having storage capacity M/k = N, hence we consider a
data rate of k/(k + 2). We require that the encoded storage array is resilient to any 2 node erasures. To satisfy the redundancy
and erasure resiliency properties, the file is encoded using a (k + 2, k) MDS distributed storage code. A storage code has the
MDS property when any collection of k storage nodes can reconstruct the file f.

Fig. 1. A (k + 2, k) Coded Storage Array.

In Fig. 1 we provide a general structure of a two parity MDS encoded storage array. The first k storage nodes store the
systematic file parts. Without loss of generality, the first parity stores the sum of all k systematic parts $f_1 + \dots + f_k$ and the
second parity stores a linear combination of them, $A_1^T f_1 + \dots + A_k^T f_k$. Here, $A_i$ denotes an N × N matrix of coding coefficients
used by the second parity node to "scale and mix" the contents of the i-th file piece $f_i$, $i \in \{1, \dots, k\}$. This representation is
a systematic one: k nodes store uncoded file pieces and each of the 2 parities stores a linear combination of the k file parts.

In this work, we are interested in maintaining the same level of redundancy when a storage component fails or leaves
the system. To do that, the code repair process has to take place to exactly regenerate the lost data in a newcomer storage
component. Let, for example, a systematic node $i \in \{1, \dots, k\}$ fail. Then, a newcomer joins the storage network, connects to
the remaining nodes, and has to download sufficient data to reconstruct $f_i$.

It is important to note that the lost systematic part $f_i$ exists only as a term of a linear combination at each parity node, as
seen in Fig. 1. Therefore, to regenerate the N elements of $f_i$, the newcomer has to download from the parity nodes an amount of
data equal to the size of the lost piece, i.e., N linearly independent coded elements. Assuming that it downloads the same
amount of data from both parities, the downloaded contents can be represented as a stack of N equations

$$ \begin{bmatrix} p_i^{(1)} \\ p_i^{(2)} \end{bmatrix} = \begin{bmatrix} (V_i^{(1)})^T \\ (V_i^{(2)})^T A_i^T \end{bmatrix} f_i + \sum_{s=1, s \neq i}^{k} \begin{bmatrix} (V_i^{(1)})^T \\ (V_i^{(2)})^T A_s^T \end{bmatrix} f_s, \tag{1} $$

where the first term is the useful data, the sum collects the interference, $p_i^{(1)}, p_i^{(2)} \in \mathbb{F}_q^{N/2}$ are the equations downloaded from the first and second parity node, respectively, and $V_i^{(1)}, V_i^{(2)} \in \mathbb{F}_q^{N \times N/2}$
are the repair matrices. Each repair matrix is used to mix the N parity contents so that a set of N/2 equations is formed.
Then, retrieving $f_i$ from (1) is equivalent to solving an underdetermined set of N equations in the kN unknowns of f, with
respect to the N desired unknowns of $f_i$. However, this is not possible due to the k − 1 additive interference components in the
received equations generated by the undesired unknowns $f_s$, $s \in \{1, \dots, k\} \setminus i$, as noted in (1). These k − 1 interference terms
corrupt the desired data and need to be canceled. Hence, the newcomer needs to download additional data from the remaining
k − 1 systematic nodes, that will "replicate" and cancel the interference terms from the downloaded equations.

Fig. 2. Repair of a (4, 2) code. Let systematic node 1 fail. Then, a newcomer node joins the system and downloads data from the 3 remaining nodes to
regenerate $f_1$. The useful information is mixed with the undesired part $f_2$ in both information chunks downloaded from the parities. These interference parts
are highlighted in red. To retrieve $f_1$, a basis of the interference equations needs to be downloaded from systematic node 2. Then, the newcomer can erase
the interference terms.

To cancel a single interference term of (1) that has size N, it suffices to download a basis of equations that generates it.
The dimension of this basis does not need to be equal to N. For example, to erase

$$ \begin{bmatrix} (V_i^{(1)})^T \\ (V_i^{(2)})^T A_s^T \end{bmatrix} f_s, \quad s \in \{1, \dots, k\} \setminus i, \tag{2} $$

the newcomer needs to connect to systematic node s and download a number of linear equations in $f_s$ that can generate (2);
this number is equal to

$$ \frac{N}{2} \le \operatorname{rank}\left(\left[ V_i^{(1)} \;\; A_s V_i^{(2)} \right]\right). \tag{3} $$

This is exactly the communication bandwidth price we are paying to delete a single interference term in order to be able to
reconstruct $f_i$. The lower bound in (3) comes from the fact that N/2 linearly independent equations need to be downloaded
from each of the parities, hence $\operatorname{rank}(V_i^{(1)}) = \operatorname{rank}(V_i^{(2)}) = N/2$ for any $i \in \{1, \dots, k\}$. Eventually, we need to generate all
undesired terms in the newcomer, so as to subtract them from (1). Then, a full rank system of N equations in the N unknowns
has to be formed. A generic example of a code repair instance for a (4, 2) storage code is given in Fig. 2.

In general, to repair a systematic node $i \in \{1, \dots, k\}$ of an arbitrary (k + 2, k) MDS storage code, we need to obtain a
feasible solution to the following rank constrained, rank minimization (performed over $\mathbb{F}_q$)

$$ \mathcal{R}_i: \quad \min_{V_i^{(1)}, V_i^{(2)}} \sum_{s=1, s \neq i}^{k} \operatorname{rank}\left(\left[ V_i^{(1)} \;\; A_s V_i^{(2)} \right]\right) \quad \text{s.t.} \quad \operatorname{rank}\left(\left[ V_i^{(1)} \;\; A_i V_i^{(2)} \right]\right) = N, $$

where i) the full rank constraint corresponds to the requirement that the N equations downloaded from the parities are linearly
independent, when viewed as equations in the N components of $f_i$, and ii) the rank minimization corresponds to minimizing
the sum of bases dimensions needed to cancel each interference term. For a specific feasible selection of repair matrices the
repair bandwidth to exactly regenerate systematic node i is given by

$$ N + \sum_{s=1, s \neq i}^{k} \operatorname{rank}\left(\left[ V_i^{(1)} \;\; A_s V_i^{(2)} \right]\right), \tag{4} $$

where the sum rank term is the aggregate of interference dimensions. An optimal solution to $\mathcal{R}_i$ is guaranteed to minimize
the repair bandwidth we need to communicate to repair systematic node $i \in \{1, \dots, k\}$.



Fig. 3. A repair optimal (5, 3) MDS code over $\mathbb{F}_{11}$.



Remark 1: From [2] it is known that the theoretical minimum repair bandwidth, for any single node repair of an optimal
(linear or nonlinear) (k + 2, k) MDS code is exactly (k + 1)N/2, where N has to be an even number. This bound is proven
using cut-set bounds on infinite flow graphs. Here, we provide an interpretation of this bound in terms of linear codes by
calculating the minimum possible sum of ranks in $\mathcal{R}_i$: since each repair matrix has to have full column rank N/2 to be a feasible
solution, the minimum number of dimensions each interference can be suppressed to is N/2. This aggregates to a minimum
repair bandwidth of (k + 1)N/2 repair equations: N from the two parities plus N/2 from each of the k − 1 remaining systematic
nodes. If we wish to achieve this bound, interference alignment has to be employed,
so that undesired components like (2) are confined to the minimum number of dimensions. Interestingly, linear codes suffice
to asymptotically achieve this bound [18], [19].

We know what the theoretical minimum repair bandwidth is and that there exist asymptotically optimal schemes; however,
designing MDS codes with repair strategies that achieve it has been challenging. The difficulty in designing optimal MDS
storage codes lies in a threefold requirement: i) the code has to satisfy the MDS property, ii) systematic nodes of the code
have to be optimally repaired, and iii) parity nodes of the code have to be optimally repaired. Currently, there exist MDS codes
for rates k/n ≤ 1/2 [15], [20] for which all nodes can be optimally repaired. For the high data rate regime, Tamo et al. [21] and
Cadambe et al. [23] presented the first MDS codes where any systematic node failure can be optimally repaired. However,
prior to this work, there did not exist MDS storage codes of arbitrarily high rate that can optimally repair any node.

In the following, we present the first explicit, high-rate, repair optimal (k + 2, k) MDS storage code that achieves the
minimum repair bandwidth bound for the repair of any single systematic or parity node failure.

III. A Repair Optimal 2 Parity Storage Code

Let a (k + 2, k) MDS storage code for file size $M = k2^{k+1}$, with coding matrices

$$ A_i = a_i X_i + b_i X_{k+1} + I_N, \quad i \in \{1, \dots, k\}, \tag{5} $$

where $N = 2^{k+1}$,

$$ X_i = I_{2^{i-1}} \otimes \operatorname{blkdiag}\left( I_{N/2^i}, \, -I_{N/2^i} \right), \tag{6} $$

and $a_i$, $b_i$ satisfy $a_i^2 - b_i^2 = -1$,¹ for all $i \in \{1, \dots, k\}$.

Theorem 1: There exists a finite field $\mathbb{F}_q$ of order q ≥ 2k + 3 and explicit constants $a_i, b_i \in \mathbb{F}_q$, for all $i \in \{1, \dots, k\}$, such that
the (k + 2, k) storage code in (5) is a repair optimal MDS storage code.

In Fig. 3, we give the coding matrices of a (5, 3) MDS code over $\mathbb{F}_{11}$ based on our construction.
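
To make the definitions in (5) and (6) concrete, here is a minimal sketch that builds the diagonals of $X_i$ and $A_i$ over a prime field. The helper names and the brute-force search for $(a_i, b_i)$ are assumptions made for illustration. The pairs it finds satisfy $a^2 - b^2 = -1$, but they are not claimed to be the explicit constants of Theorem 1, nor to satisfy the MDS conditions.

```python
def x_diag(i, N):
    """Diagonal of X_i = I_{2^(i-1)} (kron) blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2 ** i
    return ([1] * half + [-1] * half) * 2 ** (i - 1)


def coefficient_pairs(q):
    """All (a, b) in F_q with a != 0 and a^2 - b^2 = -1 (q an odd prime)."""
    return [(a, b) for a in range(1, q) for b in range(q)
            if (a * a - b * b + 1) % q == 0]


def coding_diag(i, k, a, b, q):
    """Diagonal of A_i = a X_i + b X_{k+1} + I_N, reduced mod q."""
    N = 2 ** (k + 1)
    return [(a * s + b * t + 1) % q
            for s, t in zip(x_diag(i, N), x_diag(k + 1, N))]


if __name__ == "__main__":
    k, q = 3, 11
    pairs = coefficient_pairs(q)
    for i in range(1, k + 1):
        a, b = pairs[i - 1]          # illustrative choice only
        print(i, (a, b), coding_diag(i, k, a, b, q))
```

Because every $X_i$ is diagonal with entries ±1, each $A_i$ is diagonal as well, so storing only the diagonals is enough for experiments of this kind.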

Remark 2: The code constructions presented here have generator matrices that are as sparse as possible, since any additional 
sparsity would violate the MDS property. This creates the additional benefit of minimum update complexity when some bits 
of the stored data object change. 

Before we proceed with proving Theorem 1, we state the intuition behind our code construction and the tools that we use.
Motivated by the asymptotic IA schemes, we use similar concepts based on a combinatorial explanation of interference
alignment in terms of dots on lattices. In contrast to the asymptotic IA codes, here, instead of letting randomness choose the
coding matrices, we select particular constructions based on Hadamard matrices that achieve exact interference alignment for
file sizes (symbol extensions) that are fixed in k. In Section V we prove the optimal repair of systematic nodes, in Section VI we
show the optimal repair of parity nodes, and in Section VIII we state explicit conditions for the MDS property.



¹ We use −1 to denote the field element q − 1 over $\mathbb{F}_q$.



IV. Dots-on-a-Lattice and Hadamard Designs



In Section II, we showed that minimizing the communication bandwidth to repair nodes of a storage code is equivalent
to the problem of minimizing the dimensions of interference terms generated during each repair process. Here, we consider
the problem of designing coding and repair matrices that can achieve perfect interference alignment in a finite number of
extensions. We begin by assuming arbitrary constructions and then we use a combinatorial explanation of IA to find conditions
under which perfect alignment is possible in the finite file regime. Eventually, we show that exact IA conditions and linear independence
requirements posed by our problem are simultaneously satisfied through the use of Hadamard designs.

Assume two arbitrary N × N full rank matrices $T_1$ and $T_2$ that commute. We wish to construct a full rank matrix V, with
at most N/2 columns, such that the span of $T_1V$ aligns as much as possible with the span of $T_2V$: we have to pick V such
that it minimizes the dimensions of the union of the two spans, that is, the rank of $[T_1V \;\; T_2V]$. How can we construct such
a matrix? Assume that we start with one vector with nonzero entries, i.e., V = w, and for simplicity we let it be the all-ones
vector. Then in the general case, $T_1w$ and $T_2w$ have zero intersection, which is not desired. However, we can augment V
such that it has as columns the elements of the set $\{w, T_1w, T_2w, T_1T_2w\}$. Observe that each vector $T_1^{x_1}T_2^{x_2}w$ of V can
be represented by the power tuple $(x_1, x_2)$. This helps us visualize V as a set of dots on the 2-dimensional integer lattice as
shown in Fig. 4.

Fig. 4. Representing V as dots on a lattice.



For this new selection of V, we have

$$ T_1V = [T_1w \;\; T_1^2w \;\; T_1T_2w \;\; T_1^2T_2w] \tag{7} $$
$$ \text{and} \quad T_2V = [T_2w \;\; T_2T_1w \;\; T_2^2w \;\; T_1T_2^2w]. \tag{8} $$

The intersection of the spans of these two matrices is now nonzero: the matrix $[T_1V \;\; T_2V]$ has rank 7 instead of the maximum
possible of 8. This happens because the vector $T_1T_2w$ is repeated in both matrices $T_1V$ and $T_2V$. In Fig. 5 we illustrate
this concatenation, in terms of dots on $\mathbb{Z}^2$, where the intersection between the two spans is manifested as an overlap of dots.

Fig. 5. Representing $[T_1V \;\; T_2V]$ as dots on a lattice.



Remark 3: Observe how matrix multiplication of $T_1$ and $T_2$ with the vectors in V is pronounced through the dots
representation: the dots representations of the $T_1V$ and $T_2V$ matrices are shifted versions of V along the $x_1$ and $x_2$ axes.
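
The shift picture of Remark 3 can be reproduced with plain sets of exponent tuples. The sketch below only counts dots; it assumes that distinct dots correspond to linearly independent columns, which is what the rank statements in the text rely on for generic $T_1$ and $T_2$. The function names are illustrative.

```python
def lattice(m):
    """Dots of V: exponent tuples (x1, x2) with 0 <= x1, x2 < m."""
    return {(x1, x2) for x1 in range(m) for x2 in range(m)}


def shift(points, axis):
    """Dots of T_axis * V: every dot moves one step along the given axis."""
    dx1, dx2 = (1, 0) if axis == 1 else (0, 1)
    return {(x1 + dx1, x2 + dx2) for x1, x2 in points}


def alignment_summary(m):
    """Count the dots of V, the overlap of T1*V and T2*V, and their union."""
    v = lattice(m)
    t1v, t2v = shift(v, 1), shift(v, 2)
    return {"dots_in_V": len(v),
            "overlap": len(t1v & t2v),
            "union": len(t1v | t2v)}


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, alignment_summary(m))
```

For m = 2 the union has 7 dots with a single overlapping dot, matching the rank 7 observed for the 8 columns of the concatenation.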

The key idea behind choosing a new V at each step is to iteratively augment the old one with products of the matrices
raised to specific powers times the current V:

initialize: $V \leftarrow w$ (9)

multiply with powers of $T_1$: $V \leftarrow [V \;\; T_1V \;\dots\; T_1^{m-1}V]$ (10)

multiply with powers of $T_2$: $V \leftarrow [V \;\; T_2V \;\dots\; T_2^{m-1}V]$. (11)

In general, by using powers up to m − 1, with $m^2 \le N/2$, we obtain V with columns that are the elements of the set

$$ \mathcal{V} = \left\{ T_1^{x_1} T_2^{x_2} w : x_1, x_2 \in \{0, \dots, m-1\} \right\}, \tag{12} $$

where $w = 1_{N \times 1}$. Then, matrix V achieves the following property

$$ m^2 \le \operatorname{rank}\left([T_1V \;\; T_2V]\right) \le (m+1)^2, \tag{13} $$

which means that we can asymptotically create as much alignment as we desire within the spans of the matrices $T_iV$, for
arbitrarily large "symbol extensions", i.e., for sufficiently large N, $(m+1)^2/m^2$ is arbitrarily close to 1. For example, we
give the m = 4 case in Fig. 6, where we observe that the alignment is more substantial (with respect to the size of V)
compared to Fig. 5. This alignment scheme, in a more general form, was presented by Cadambe and Jafar in [11] to prove
the Degrees-of-Freedom of the K-user interference channel. For that wireless scenario, the matrices are given by nature
and are i.i.d. diagonals. Perfect alignment of spaces for these matrices is not known to be possible for finite m [27], [28].

Fig. 6. Representing $[T_1V \;\; T_2V]$ as dots on a lattice.

For network coding problems, and in particular for storage coding problems, the analogous matrices (our coding matrices)
are free to design under some specific constraints that ensure the MDS property of the code. Before we give explicit matrices
that achieve alignment in a finite number of extensions, we answer the analogous question considering our toy example: do
there exist $T_1$ and $T_2$ matrices such that we can construct a full-rank V that achieves perfect intersection (exact alignment)
of the spans of $T_1V$ and $T_2V$, for some finite m and N? That is, can we find matrices such that

$$ \operatorname{span}(T_1V) = \operatorname{span}(T_2V) \quad \text{and} \quad \operatorname{rank}(V) = m^2 \tag{14} $$

is possible? We show that a sufficient condition for perfect alignment is satisfied when the elements of the matrices are m-th
roots of unity, i.e.,

$$ T_i^m = I_N. \tag{15} $$

To see that, we formally state the dots on a lattice representation. Let $\mathcal{L}$ be a map from a matrix with r columns, each generated
as $T_1^{x_1}T_2^{x_2}w$, to a set of r points, such that the column $T_1^{x_1}T_2^{x_2}w$ maps to the point $(x_1, x_2)$. Then, we have for V

$$ \mathcal{L}(V) = \left\{ x_1e_1 + x_2e_2 ;\; x_1, x_2 \in [m] \right\}, \tag{16} $$

where $[m] = \{0, \dots, m-1\}$ and $e_i$ is the i-th column of the identity matrix. Using this representation, the products $T_1V$
and $T_2V$ map to

$$ \mathcal{L}(T_1V) = \left\{ (x_1+1)e_1 + x_2e_2 : x_1, x_2 \in [m] \right\} \quad \text{and} \quad \mathcal{L}(T_2V) = \left\{ x_1e_1 + (x_2+1)e_2 : x_1, x_2 \in [m] \right\}, \tag{17} $$

respectively. For perfect alignment, we have to design the matrices such that

$$ \mathcal{L}(T_1V) = \mathcal{L}(T_2V). \tag{18} $$

A sufficient set of conditions for perfect span intersection is that V, $T_1V$, and $T_2V$ perfectly intersect, i.e.

$$ \mathcal{L}(T_1V) = \mathcal{L}(V) \Leftrightarrow \left\{ (x_1+1)e_1 + x_2e_2 : x_1, x_2 \in [m] \right\} = \left\{ x_1e_1 + x_2e_2 : x_1, x_2 \in [m] \right\}, \tag{19} $$

$$ \mathcal{L}(T_2V) = \mathcal{L}(V) \Leftrightarrow \left\{ x_1e_1 + (x_2+1)e_2 : x_1, x_2 \in [m] \right\} = \left\{ x_1e_1 + x_2e_2 : x_1, x_2 \in [m] \right\}. \tag{20} $$



The above conditions are satisfied when the matrix powers "wrap around" upon reaching a certain modulus m. This wrap-around
property is obtained when the $T_1$ and $T_2$ matrices have elements that are roots of unity

$$ T_1^m = T_2^m = I_N. \tag{21} $$

However, arbitrary diagonal matrices whose elements are m-th roots of unity are not sufficient to ensure the full rank property
of V. To hint at a general procedure which outputs "good" matrices, we see an example where we pick them such that
V has orthogonal columns. Let us briefly consider the case where m = 2 and $N = 2^3$, for which we choose
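
$$ T_1 = \operatorname{diag}(1, -1, 1, -1, 1, -1, 1, -1) \quad \text{and} \quad T_2 = \operatorname{diag}(1, 1, -1, -1, 1, 1, -1, -1). \tag{22} $$

For these matrices, V has $m^2 = 4$ orthogonal columns

$$ V = [w \;\; T_1w \;\; T_2w \;\; T_1T_2w] = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} \tag{23} $$

and $T_2V = [T_2w \;\; T_1T_2w \;\; w \;\; T_1w]$, $T_1V = [T_1w \;\; w \;\; T_1T_2w \;\; T_2w]$ indeed have fully overlapping spans. Interestingly,
we observe that for the additional matrix

$$ T_3 = \operatorname{diag}(1, 1, 1, 1, -1, -1, -1, -1) \tag{24} $$

we have that $[V \;\; T_3V] = H_8$, where $H_8$ is the 8 × 8 Hadamard matrix. In the following we see that Hadamard designs
provide the conditions for perfect alignment and linear independence in a more general setting.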

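Let m = 2, $N = 2^L$, and $X_i = I_{2^{i-1}} \otimes \operatorname{blkdiag}\left( I_{N/2^i}, \, -I_{N/2^i} \right)$, for $i \in \{1, \dots, L\}$, and consider the set

$$ \mathcal{H}_N = \left\{ \prod_{i=1}^{L} X_i^{x_i} w : x_i \in \{0, 1\} \right\}. \tag{25} $$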



Lemma 1: Let an N × N Hadamard matrix of the Sylvester's construction

$$ H_N \triangleq \begin{bmatrix} H_{N/2} & H_{N/2} \\ H_{N/2} & -H_{N/2} \end{bmatrix}, \tag{26} $$

with $H_1 = 1$. Then, $H_N$ is full rank with mutually orthogonal columns, that are the N elements of $\mathcal{H}_N$.
The proof of Lemma 1 can be found in the Appendix.

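Example: To illustrate the connection between $H_N$ and $\mathcal{H}_N$ we "decompose" the Hadamard matrix of order 4

$$ H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} = [w \;\; X_2w \;\; X_1w \;\; X_2X_1w], \tag{27} $$

where $X_1 = \operatorname{diag}(1, 1, -1, -1)$ and $X_2 = \operatorname{diag}(1, -1, 1, -1)$. Due to the commutativity of $X_1$ and $X_2$, the columns of $H_4$ are also the
elements of $\mathcal{H}_4 = \{w, X_1w, X_2w, X_1X_2w\}$.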
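
Lemma 1 can be checked numerically for small orders. The sketch below builds the Sylvester matrix recursively and, separately, generates all products of the diagonal sign matrices applied to the all-ones vector; the helper names are hypothetical and the check is only an illustration for a few values of N, not a proof.

```python
from itertools import product


def sylvester(N):
    """Sylvester Hadamard matrix of order N (a power of two), as a list of rows."""
    H = [[1]]
    while len(H) < N:
        H = [r + r for r in H] + [r + [-v for v in r] for r in H]
    return H


def x_diag(i, N):
    """Diagonal of X_i = I_{2^(i-1)} (kron) blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2 ** i
    return ([1] * half + [-1] * half) * 2 ** (i - 1)


def hadamard_set(N):
    """All products X_1^{x_1} ... X_L^{x_L} w, with w the all-ones vector."""
    L = N.bit_length() - 1
    vectors = set()
    for exponents in product((0, 1), repeat=L):
        v = [1] * N
        for i, e in enumerate(exponents, start=1):
            if e:
                v = [a * b for a, b in zip(v, x_diag(i, N))]
        vectors.add(tuple(v))
    return vectors


def columns(H):
    """Set of columns of a matrix given as a list of rows."""
    return {tuple(col) for col in zip(*H)}


if __name__ == "__main__":
    for N in (2, 4, 8, 16):
        print(N, columns(sylvester(N)) == hadamard_set(N))
```

Comparing sets rather than ordered matrices reflects the statement of the lemma: the columns coincide as a collection, while their order depends on how the exponents are enumerated.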

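Now, consider the matrix $V_i$ that has as columns the elements of

$$ \left\{ \prod_{s=1, s \neq i}^{L} X_s^{x_s} w : x_s \in \{0, 1\} \right\}. \tag{28} $$

We know that the space of $V_i$ is invariant with respect to $X_j$, for $j \neq i$, since the corresponding lattice representation wraps around itself
due to $X_j^2 = I_N$. Additionally, we have

$$ \mathcal{L}(X_iV_i) = \left\{ e_i + \sum_{s=1, s \neq i}^{L} x_se_s : x_s \in \{0, 1\} \right\}, $$

and we observe that $\mathcal{L}(X_iV_i) \cap \mathcal{L}(V_i) = \emptyset$, since $\mathcal{L}(V_i)$ does not include any points with nonzero $x_i$ coordinates. Then, due
to the orthogonality of elements within $\mathcal{H}_N$, we have

$$ |\mathcal{L}(V_i)| = |\mathcal{L}(X_jV_i)| = \operatorname{rank}(V_i) = \operatorname{rank}(X_jV_i) = \frac{N}{2} \tag{29} $$

for any $i, j \in \{1, \dots, L\}$. Hence, we obtain the following lemma for the set $\mathcal{H}_N$ and its associated $\mathcal{L}$ map.
Lemma 2: For any $i, j \in \{1, 2, \dots, L\}$ we have that

$$ \operatorname{rank}\left([V_i \;\; X_jV_i]\right) = |\mathcal{L}(V_i) \cup \mathcal{L}(X_jV_i)| = \begin{cases} N, & i = j, \\ N/2, & i \neq j. \end{cases} \tag{30} $$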
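
A small numerical check of Lemma 2 is sketched below for N = 8. Ranks are computed over the prime field $\mathbb{F}_{11}$, which is an assumption chosen for illustration (any odd prime keeps the ±1 columns independent); the helper names are hypothetical.

```python
from itertools import product


def x_diag(i, N):
    """Diagonal of X_i = I_{2^(i-1)} (kron) blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2 ** i
    return ([1] * half + [-1] * half) * 2 ** (i - 1)


def v_matrix(i, L):
    """Rows of the N x N/2 matrix V_i: products of X_s^{x_s} w over s != i."""
    N = 2 ** L
    others = [s for s in range(1, L + 1) if s != i]
    cols = []
    for exponents in product((0, 1), repeat=len(others)):
        v = [1] * N
        for s, e in zip(others, exponents):
            if e:
                v = [a * b for a, b in zip(v, x_diag(s, N))]
        cols.append(v)
    return [list(row) for row in zip(*cols)]


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) by Gauss-Jordan elimination; needs Python 3.8+."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def lemma2_rank(i, j, L, p=11):
    """Rank of the concatenation [V_i  X_j V_i] over F_p."""
    xj = x_diag(j, 2 ** L)
    V = v_matrix(i, L)
    stacked = [row + [xj[r] * v for v in row] for r, row in enumerate(V)]
    return rank_mod_p(stacked, p)


if __name__ == "__main__":
    L = 3
    for i in range(1, L + 1):
        print([lemma2_rank(i, j, L) for j in range(1, L + 1)])
```

Each printed row should show the full rank N only in position j = i, and N/2 elsewhere.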



In Fig. 7 we give an illustrative example of the aforementioned definitions and properties. For $N = 2^3$, we consider $H_8$
and $V_3$ along with the matrix product $X_2V_3$ and their corresponding lattice representations.

Fig. 7. We set N = 8 and show the dots representation of $H_8$, $V_3$, and $X_2V_3$.

We use the aforementioned properties of Hadamard matrices to construct repair matrices for our code construction; these
matrices have perfect space alignment properties for the repair instances of the code in (5) induced by single node failures.

Remark 4: Notice that equations (22) and (23) are respectively analogous to the channel matrices and beamforming vectors
used in wireless channels for ergodic interference alignment [25]. In particular, for the K user interference channel, the channel
matrices used for ergodic alignment are diagonalized versions of the column vectors of H2 . 

V. Optimal Systematic Node Repair 

Let systematic node $i \in \{1, \dots, k\}$ of the code in (5) fail. The coding matrix $A_i$ corresponding to the lost systematic
piece $f_i$ holds one matrix, that is, $X_i$, which is unique among all other coding matrices $A_s$, $s \in \{1, \dots, k\} \setminus i$. We pick the
repair matrix as a set of N/2 vectors whose lattice representation is invariant to all $X_j$'s but one key matrix: the unique $X_i$
component of $A_i$. We construct the N × N/2 repair matrix $V_i$ whose columns are the elements of the set

$$ \left\{ \prod_{s=1, s \neq i}^{k+1} X_s^{x_s} w : x_s \in \{0, 1\} \right\}. \tag{31} $$

This repair matrix is used to multiply both the contents of parity node 1 and 2, that is, $V_i^{(1)} = V_i^{(2)} = V_i$. During the repair,
the useful (desired signal) space populated by $f_i$ is

$$ [V_i \;\; A_iV_i] \tag{32} $$

and the interference space due to file part $f_s$, $s \in \{1, \dots, k\} \setminus i$, is

$$ [V_i \;\; A_sV_i]. \tag{33} $$

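Remember that an optimal solution to $\mathcal{R}_i$ requires the useful space to have rank N and each of the interference spaces rank
N/2. Observe that the following holds for each of the interference spaces

$$ \frac{N}{2} \le \operatorname{rank}\left(\left[V_i \;\; (a_sX_s + b_sX_{k+1} + I_N)V_i\right]\right) \tag{34} $$
$$ \le \left|\mathcal{L}(V_i) \cup \mathcal{L}(X_sV_i) \cup \mathcal{L}(X_{k+1}V_i)\right| = \left|\mathcal{L}(V_i)\right| = \frac{N}{2}, \tag{35} $$

for $s \in \{1, \dots, k\} \setminus i$, since the lattice of $V_i$ is invariant under both $X_s$ and $X_{k+1}$. Then, for the useful data space we have

$$ N \ge \operatorname{rank}\left([V_i \;\; A_iV_i]\right) = \operatorname{rank}\left(\left[V_i \;\; (a_iX_i + b_iX_{k+1} + I_N)V_i\right]\right) \overset{(*)}{=} \operatorname{rank}\left([V_i \;\; X_iV_i]\right) = \left|\mathcal{L}(V_i) \cup \mathcal{L}(X_iV_i)\right| = \left|\mathcal{L}(H_N)\right| = N, \tag{36} $$

for any $a_i \neq 0$, where (*) comes from the fact that $(a_iX_i + b_iX_{k+1} + I_N)V_i$ is a linear combination of columns from
$V_i$, $X_{k+1}V_i$, and $X_iV_i$. The column spaces of $V_i$ and $X_{k+1}V_i$ are identical, hence we can generate the columns of
$(a_iX_i + b_iX_{k+1} + I_N)V_i$ by linear combinations of the columns in $X_iV_i$ and in $V_i$; however, $V_i$ is already in the concatenation
$[V_i \;\; (a_iX_i + b_iX_{k+1} + I_N)V_i]$. This means that $[V_i \;\; X_iV_i]$ and $[V_i \;\; (a_iX_i + b_iX_{k+1} + I_N)V_i]$ have the same
span.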

Therefore, we are able to generate the minimum amount of interference and at the same time satisfy the full rank constraint
of $\mathcal{R}_i$. The repair matrix in (31) is an optimal solution for $\mathcal{R}_i$ and systematic node i can be optimally repaired by downloading
(k + 1)N/2 worth of data equations, for all $i \in \{1, \dots, k\}$. In Fig. 8, we sketch the structure of our code. In each block of the
second parity we denote the key matrices that comprise it. We select our repair matrix such that it "absorbs" all matrices but
the key one. That way, interference aligns in half the dimensions, and the useful space spans all N dimensions.

Fig. 8. A (5, 3) repair optimal code.
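
The rank conditions for systematic repair can be checked numerically for the (5, 3) case over $\mathbb{F}_{11}$. The sketch below is an illustration under stated assumptions: the coefficient pairs are arbitrary solutions of $a^2 - b^2 = -1$ modulo 11 (not necessarily the constants used in Fig. 3, and not checked for the MDS property), and all helper names are hypothetical.

```python
from itertools import product


def x_diag(i, N):
    """Diagonal of X_i as defined in (6)."""
    half = N // 2 ** i
    return ([1] * half + [-1] * half) * 2 ** (i - 1)


def v_matrix(i, L):
    """Rows of the N x N/2 repair matrix V_i (products of X_s^{x_s} w, s != i)."""
    N = 2 ** L
    others = [s for s in range(1, L + 1) if s != i]
    cols = []
    for exponents in product((0, 1), repeat=len(others)):
        v = [1] * N
        for s, e in zip(others, exponents):
            if e:
                v = [a * b for a, b in zip(v, x_diag(s, N))]
        cols.append(v)
    return [list(row) for row in zip(*cols)]


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) by Gauss-Jordan elimination; needs Python 3.8+."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def coding_diag(s, k, a, b, q):
    """Diagonal of A_s = a X_s + b X_{k+1} + I_N, reduced mod q."""
    N = 2 ** (k + 1)
    return [(a * x + b * y + 1) % q
            for x, y in zip(x_diag(s, N), x_diag(k + 1, N))]


def systematic_repair_ranks(i, k, coeffs, q):
    """rank([V_i  A_s V_i]) over F_q for every s, when systematic node i fails."""
    V = v_matrix(i, k + 1)
    ranks = {}
    for s in range(1, k + 1):
        a, b = coeffs[s - 1]
        A = coding_diag(s, k, a, b, q)
        stacked = [row + [A[r] * v for v in row] for r, row in enumerate(V)]
        ranks[s] = rank_mod_p(stacked, q)
    return ranks


if __name__ == "__main__":
    k, q = 3, 11
    coeffs = [(2, 4), (5, 2), (9, 4)]   # illustrative pairs with a^2 - b^2 = -1 mod 11
    for i in range(1, k + 1):
        print(i, systematic_repair_ranks(i, k, coeffs, q))
```

With N = 16, the entry for s = i should be 16 (useful space) and every other entry 8 (aligned interference), so the total download is 16 + 2 · 8 = 32 = (k + 1)N/2.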



VI. Optimal Parity Repair 



The ingredient of our construction that "unlocks" optimal repair for the first parity is the inclusion of the identity matrix in
each $A_i$. The same goes for the $X_{k+1}$ matrix and the repair of the second parity. Both these additionally included matrices
refine the parity repair process such that optimality is feasible. Selecting appropriate constants $a_i$ and $b_i$ is also essential to our
developments. To optimally solve the problem, we rewrite the parity repair as a systematic one in an equivalent re-interpretation
of our code.

A. Repairing the first parity 

Let the first parity node fail. We make a change of variables to obtain a new representation for our code in (5), where the
first parity is a systematic node in an equivalent representation. We start with our (k + 2, k) MDS storage code of (5) (the $A_i$ of (5) are diagonal, hence $A_i^T = A_i$)

$$ \begin{bmatrix} I_N & 0_N & \cdots & 0_N \\ 0_N & I_N & \cdots & 0_N \\ \vdots & & \ddots & \vdots \\ 0_N & 0_N & \cdots & I_N \\ I_N & I_N & \cdots & I_N \\ A_1 & A_2 & \cdots & A_k \end{bmatrix} f \tag{37} $$

and make the following change of variables

$$ y_1 = \sum_{i=1}^{k} f_i, \tag{38} $$
$$ f_s = y_s, \quad s \in \{2, \dots, k\}. \tag{39} $$

We solve (38) and (39) for $f_1$ in terms of the $y_i$ variables and obtain

$$ f_1 = y_1 - \sum_{s=2}^{k} y_s. \tag{40} $$

Then, we plug (39) and (40) in (37), to have the equivalent representation

$$ \begin{bmatrix} I_N & -I_N & \cdots & -I_N \\ 0_N & I_N & \cdots & 0_N \\ \vdots & & \ddots & \vdots \\ 0_N & 0_N & \cdots & I_N \\ I_N & 0_N & \cdots & 0_N \\ A_1 & A_2 - A_1 & \cdots & A_k - A_1 \end{bmatrix} y, \tag{41} $$

where $y = [y_1^T \dots y_k^T]^T$. The first parity node of the code in (5) now corresponds to the node which contains $y_1$ in
the aforementioned representation. The coding matrices under this new representation are

$$ A_1 = a_1X_1 + b_1X_{k+1} + I_N, \tag{42} $$
$$ A_s - A_1 = a_sX_s + (b_s - b_1)X_{k+1} - a_1X_1, \tag{43} $$

for $s \in \{2, \dots, k\}$. In contrast to the systematic node repair process, in the following we use a repair matrix of a slightly
different structure. We construct the repair matrix $V_a$ with columns in the set

$$ \left\{ \prod_{s=2}^{k+1} (X_1X_s)^{x_s} w : x_s \in \{0, 1\} \right\}. \tag{44} $$

Observe that this set is also a subset of $\mathcal{H}_N$. Then, to repair the node of (41) that contains $y_1$ (i.e., the one that corresponds
to the first parity node of (37)) we download $X_1V_a$ times the contents of the first parity in (41) and $V_a$ times the contents
of the second parity. Hence, during this repair, the useful space is spanned by

$$ [X_1V_a \;\; A_1V_a] $$

and the interference space due to file part $y_s$, $s \in \{2, \dots, k\}$, is

$$ [X_1V_a \;\; (A_s - A_1)V_a]. $$

Before we proceed, observe that the following hold



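$$ \mathcal{L}(X_1X_sV_a) = \mathcal{L}(V_a) \tag{45} $$
$$ = \left\{ \left( \sum_{s=2}^{k+1} x_s \;(\mathrm{mod}\; 2) \right) e_1 + \sum_{s=2}^{k+1} x_se_s ;\; x_s \in \{0, 1\} \right\}, \tag{46} $$
$$ \mathcal{L}(X_{s_1}V_a) = \mathcal{L}(X_{s_2}V_a) \tag{47} $$
$$ = \left\{ \left( 1 + \sum_{s=2}^{k+1} x_s \;(\mathrm{mod}\; 2) \right) e_1 + \sum_{s=2}^{k+1} x_se_s ;\; x_s \in \{0, 1\} \right\}, \tag{48} $$

for any $s, s_1, s_2 \in \{1, \dots, k+1\}$. The above equations imply that

$$ \mathcal{L}(V_a) \cup \mathcal{L}(X_1V_a) = \left\{ \sum_{s=1}^{k+1} x_se_s ;\; x_s \in \{0, 1\} \right\} \tag{49} $$
$$ = \mathcal{L}(H_N). \tag{50} $$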

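Therefore, we have the following for each of the interference spaces

$$ \frac{N}{2} \le \operatorname{rank}\left(\left[X_1V_a \;\; (a_sX_s + (b_s - b_1)X_{k+1} - a_1X_1)V_a\right]\right) \le \left|\mathcal{L}(X_1V_a) \cup \mathcal{L}(X_sV_a) \cup \mathcal{L}(X_{k+1}V_a)\right| = \left|\mathcal{L}(X_1V_a)\right| = \frac{N}{2}. \tag{51} $$

Moreover, for the useful data space we have

$$ \operatorname{rank}\left(\left[X_1V_a \;\; (a_1X_1 + b_1X_{k+1} + I_N)V_a\right]\right) = \operatorname{rank}\left([X_1V_a \;\; V_a]\right) = \left|\mathcal{L}(V_a) \cup \mathcal{L}(X_1V_a)\right| = \left|\mathcal{L}(H_N)\right| = N. \tag{52} $$

Thus, we can perform optimal repair of the node containing $y_1$ in (41), which is equivalent to optimally repairing the first
parity of our code in (5).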
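
The two rank statements for the first parity can also be checked numerically for k = 3 over $\mathbb{F}_{11}$. As before, this is a sketch under assumptions: the coefficient pairs are arbitrary solutions of $a^2 - b^2 = -1$ modulo 11, the helper names are hypothetical, and nothing here verifies the MDS property.

```python
from itertools import product


def x_diag(i, N):
    """Diagonal of X_i = I_{2^(i-1)} (kron) blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2 ** i
    return ([1] * half + [-1] * half) * 2 ** (i - 1)


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) by Gauss-Jordan elimination; needs Python 3.8+."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def coding_diag(s, k, a, b, q):
    """Diagonal of A_s = a X_s + b X_{k+1} + I_N, reduced mod q."""
    N = 2 ** (k + 1)
    return [(a * x + b * y + 1) % q
            for x, y in zip(x_diag(s, N), x_diag(k + 1, N))]


def va_matrix(k):
    """Rows of V_a: products of (X_1 X_s)^{x_s} w for s = 2, ..., k+1."""
    N = 2 ** (k + 1)
    x1 = x_diag(1, N)
    cols = []
    for exponents in product((0, 1), repeat=k):
        v = [1] * N
        for s, e in zip(range(2, k + 2), exponents):
            if e:
                v = [a * b * c for a, b, c in zip(v, x1, x_diag(s, N))]
        cols.append(v)
    return [list(row) for row in zip(*cols)]


def first_parity_ranks(k, coeffs, q):
    """Ranks of [X_1 V_a  A_1 V_a] and of [X_1 V_a  (A_s - A_1) V_a], s = 2..k."""
    N = 2 ** (k + 1)
    Va, x1 = va_matrix(k), x_diag(1, N)
    A = [coding_diag(s, k, a, b, q) for s, (a, b) in enumerate(coeffs, start=1)]

    def stacked(diag):
        return [[x1[r] * v for v in row] + [diag[r] * v for v in row]
                for r, row in enumerate(Va)]

    useful = rank_mod_p(stacked(A[0]), q)
    interference = [
        rank_mod_p(stacked([(A[s][r] - A[0][r]) % q for r in range(N)]), q)
        for s in range(1, k)
    ]
    return useful, interference


if __name__ == "__main__":
    print(first_parity_ranks(3, [(2, 4), (5, 2), (9, 4)], 11))
```

For N = 16 the useful rank should be 16 and each interference rank 8, which again totals (k + 1)N/2 = 32 downloaded equations.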

B. Repairing the second parity 

Here, we have an additional step. We first manipulate our coding matrices of (5) to obtain an equivalent representation for
the same code. Then, in the same manner we rewrite this code in a form where the second parity of (5) is a systematic node
in some representation. Without loss of generality, we can multiply any coding column block that multiplies the i-th file part

$$ \begin{bmatrix} I_N \\ A_i \end{bmatrix} \tag{53} $$

with a full rank matrix and maintain the same code properties, as shown in [20]. In the following derivations, we use the fact
that $X_s^2 = I_N$, for any $s \in \{1, \dots, k+1\}$. We multiply the i-th block of (53) with $a_iX_i - b_iX_{k+1} + I_N$ to obtain



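$$\begin{aligned}
\begin{bmatrix} \mathbf{I}_N \\ a_i\mathbf{X}_i + b_i\mathbf{X}_{k+1} + \mathbf{I}_N \end{bmatrix}(a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N)
&= \begin{bmatrix} a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N \\ (a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N)(a_i\mathbf{X}_i + b_i\mathbf{X}_{k+1} + \mathbf{I}_N) \end{bmatrix} \\
&= \begin{bmatrix} a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N \\ (a_i\mathbf{X}_i + \mathbf{I}_N)^2 - b_i^2\mathbf{I}_N \end{bmatrix} \\
&= \begin{bmatrix} a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N \\ 2a_i\mathbf{X}_i + (a_i^2 - b_i^2 + 1)\mathbf{I}_N \end{bmatrix} \\
&\overset{(*)}{=} \begin{bmatrix} a_i\mathbf{X}_i - b_i\mathbf{X}_{k+1} + \mathbf{I}_N \\ 2a_i\mathbf{X}_i \end{bmatrix},
\end{aligned} \tag{54}$$

where in $(*)$ we use the fact that $a_i^2 - b_i^2 + 1 = 0$. We continue by multiplying the $i$-th column block with $(a_i)^{-1}\mathbf{X}_i$ to obtain

$$\begin{bmatrix} \mathbf{I}_N - a_i^{-1}b_i\mathbf{X}_{k+1}\mathbf{X}_i + a_i^{-1}\mathbf{X}_i \\ 2\mathbf{I}_N \end{bmatrix} \rightarrow \begin{bmatrix} \mathbf{I}_N - a_i^{-1}b_i\mathbf{X}_{k+1}\mathbf{X}_i + a_i^{-1}\mathbf{X}_i \\ \mathbf{I}_N \end{bmatrix}, \tag{55}$$

where in the last step we multiplied the contents of the second parity with $2^{-1}$. Hence, let

$$\mathbf{A}'_i = \mathbf{I}_N - a_i^{-1}b_i\mathbf{X}_{k+1}\mathbf{X}_i + a_i^{-1}\mathbf{X}_i, \quad i \in \{1, \ldots, k\}. \tag{56}$$

Then, we rewrite our original code as

(57)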



where $\mathbf{f}'$ is a full rank row transformation of $\mathbf{f}$. We proceed in the same manner as for the first parity repair. We make a change of variables such that the second parity becomes a systematic node in a new representation

(58)

and obtain the equivalent form

(59)

where

$$\mathbf{A}'_1 = \mathbf{I}_N - a_1^{-1}b_1\mathbf{X}_{k+1}\mathbf{X}_1 + a_1^{-1}\mathbf{X}_1, \tag{60}$$
$$\mathbf{A}'_s - \mathbf{A}'_1 = a_s^{-1}\mathbf{X}_s - a_s^{-1}b_s\mathbf{X}_{k+1}\mathbf{X}_s + a_1^{-1}b_1\mathbf{X}_{k+1}\mathbf{X}_1 - a_1^{-1}\mathbf{X}_1. \tag{61}$$

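Then, the parity node, which corresponds to systematic node 1 here, can be repaired by using $\mathbf{V}_b$ with columns in the set

$$\mathcal{V}_b = \left\{ \mathbf{X}_{k+1}^{x_{k+1}} \prod_{s=2}^{k} (\mathbf{X}_1\mathbf{X}_s)^{x_s}\,\mathbf{w} : x_{k+1}, x_s \in \{0,1\} \right\}. \tag{62}$$

Again, the following equations hold

$$\mathcal{L}(\mathbf{X}_{k+1}\mathbf{V}_b) = \mathcal{L}(\mathbf{V}_b) = \left\{ \left(\sum_{s=2}^{k} x_s \;(\mathrm{mod}\ 2)\right)\mathbf{e}_1 + \sum_{s=2}^{k+1} x_s\mathbf{e}_s ;\; x_s \in \{0,1\} \right\}, \tag{63}$$
$$\mathcal{L}(\mathbf{X}_{s_1}\mathbf{V}_b) = \mathcal{L}(\mathbf{X}_{s_2}\mathbf{V}_b) = \left\{ \left(1 + \sum_{s=2}^{k} x_s \;(\mathrm{mod}\ 2)\right)\mathbf{e}_1 + \sum_{s=2}^{k+1} x_s\mathbf{e}_s ;\; x_s \in \{0,1\} \right\}, \tag{64}$$
$$\text{and} \quad \mathcal{L}(\mathbf{X}_{s_1}\mathbf{X}_{k+1}\mathbf{V}_b) = \mathcal{L}(\mathbf{X}_{s_2}\mathbf{V}_b), \tag{65}$$

for all $s_1, s_2 \in \{1, \ldots, k\}$. Hence, for the interference space generated by component $\mathbf{y}_s$, $s \in \{2, \ldots, k\}$, we have

$$\frac{N}{2} \le \operatorname{rank}\left([\mathbf{X}_1\mathbf{V}_b \;\; (\mathbf{A}'_s - \mathbf{A}'_1)\mathbf{V}_b]\right) \le \left|\mathcal{L}(\mathbf{X}_s\mathbf{V}_b) \cup \mathcal{L}(\mathbf{X}_1\mathbf{V}_b) \cup \mathcal{L}(\mathbf{X}_{k+1}\mathbf{X}_1\mathbf{V}_b) \cup \mathcal{L}(\mathbf{X}_{k+1}\mathbf{X}_s\mathbf{V}_b)\right| = \left|\mathcal{L}(\mathbf{X}_1\mathbf{V}_b) \cup \mathcal{L}(\mathbf{X}_s\mathbf{V}_b)\right| = \frac{N}{2}. \tag{66}$$

Moreover, the useful space is full rank

$$\operatorname{rank}\left([\mathbf{X}_1\mathbf{V}_b \;\; (\mathbf{I}_N - a_1^{-1}b_1\mathbf{X}_{k+1}\mathbf{X}_1 + a_1^{-1}\mathbf{X}_1)\mathbf{V}_b]\right) = \operatorname{rank}\left([\mathbf{X}_1\mathbf{V}_b \;\; \mathbf{V}_b]\right) = N. \tag{67}$$

Thus, we can perform optimal repair for the second parity of the code in (5), with repair bandwidth $(k+1)\frac{N}{2}$.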
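
As a sanity check of the second-parity argument, the following sketch verifies the algebraic step behind (54) and the two rank claims on a small instance. It is a hedged illustration, not the paper's procedure: real numbers stand in for elements of $\mathbb{F}_q$, chosen so that $a_i^2 - b_i^2 = -1$ still holds, and the helper names `x_matrix` and `a_prime` are hypothetical.

```python
import itertools
import numpy as np

def x_matrix(i, N):
    """X_i = I_{2^(i-1)} kron blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2**i
    block = np.concatenate([np.ones(half), -np.ones(half)])
    return np.diag(np.tile(block, 2**(i - 1)))

k = 3
N = 2**(k + 1)
I = np.eye(N)
X = {s: x_matrix(s, N) for s in range(1, k + 2)}

# Real stand-ins with a_i - b_i = x_i and a_i + b_i = -1/x_i, so a_i^2 - b_i^2 = -1.
xs = {1: 2.0, 2: 3.0, 3: 5.0}
a = {i: (x - 1 / x) / 2 for i, x in xs.items()}
b = {i: (-x - 1 / x) / 2 for i, x in xs.items()}

# Step (*) of (54): (a X_i - b X_{k+1} + I)(a X_i + b X_{k+1} + I) = 2 a X_i.
lhs = (a[1] * X[1] - b[1] * X[k + 1] + I) @ (a[1] * X[1] + b[1] * X[k + 1] + I)
identity_holds = bool(np.allclose(lhs, 2 * a[1] * X[1]))

def a_prime(i):
    """A'_i = I - a_i^{-1} b_i X_{k+1} X_i + a_i^{-1} X_i."""
    return I - (b[i] / a[i]) * (X[k + 1] @ X[i]) + X[i] / a[i]

# Columns of V_b: X_{k+1}^{x_{k+1}} prod_{s=2}^{k} (X_1 X_s)^{x_s} w.
cols = []
for bits in itertools.product([0, 1], repeat=k):
    v = np.ones(N)
    if bits[0]:
        v = X[k + 1] @ v
    for s, bit in zip(range(2, k + 1), bits[1:]):
        if bit:
            v = X[1] @ (X[s] @ v)
    cols.append(v)
Vb = np.column_stack(cols)

useful_rank = int(np.linalg.matrix_rank(np.hstack([X[1] @ Vb, a_prime(1) @ Vb])))
interference_ranks = [
    int(np.linalg.matrix_rank(np.hstack([X[1] @ Vb, (a_prime(s) - a_prime(1)) @ Vb])))
    for s in range(2, k + 1)
]
print(identity_holds, useful_rank, interference_ranks)
```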



VII. The MDS Property 

In this section, we give explicit conditions on the $a_i$, $b_i$ constants, for all $i \in \{1, \ldots, k\}$, and the size of the finite field $\mathbb{F}_q$, for which the code in (5) is MDS. We discuss the MDS property using the notion of data collectors (DCs), in the same manner that it was used in [2]. A DC can be considered as an external user that can connect and has complete access to the contents of some subset of $k$ nodes. A storage code where each node expends $\frac{M}{k}$ worth of storage has the MDS property when all possible $\binom{n}{k}$ DCs can decode the file $\mathbf{f}$. We can show that testing the MDS property is equivalent to checking the rank of a specific matrix associated with each DC. This DC matrix is the vertical concatenation of the $k$ stacks of equations stored by the nodes that the DC connects to. If all $\binom{n}{k}$ DC matrices are full rank, then we declare that the storage code has the MDS property.

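We start with a DC that connects to systematic nodes $\{1, \ldots, k-1\}$ and the first parity node. The determinant of the corresponding DC matrix is

$$\det\begin{bmatrix} \mathbf{I}_N & \cdots & \mathbf{0}_{N\times N} & \mathbf{0}_{N\times N} \\ \vdots & \ddots & \vdots & \vdots \\ \mathbf{0}_{N\times N} & \cdots & \mathbf{I}_N & \mathbf{0}_{N\times N} \\ \mathbf{I}_N & \cdots & \mathbf{I}_N & \mathbf{I}_N \end{bmatrix} = \det(\mathbf{I}_N) \neq 0, \tag{68}$$

since $\mathbf{I}_N$ is a full rank diagonal matrix. We continue by considering a DC that connects to systematic nodes $\{1, \ldots, k-1\}$ and the second parity node. For that we have

$$\det\begin{bmatrix} \mathbf{I}_N & \cdots & \mathbf{0}_{N\times N} & \mathbf{0}_{N\times N} \\ \vdots & \ddots & \vdots & \vdots \\ \mathbf{0}_{N\times N} & \cdots & \mathbf{I}_N & \mathbf{0}_{N\times N} \\ \mathbf{A}_1 & \cdots & \mathbf{A}_{k-1} & \mathbf{A}_k \end{bmatrix} = \det(\mathbf{A}_k) \neq 0, \tag{69}$$

due to $\mathbf{A}_k$ being full rank.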

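Finally, we consider DCs that connect to $k-2$ systematic nodes and both parity nodes. Consider a DC that connects to systematic nodes $\{1, \ldots, k-2\}$ and the two parities. The corresponding DC matrix is

$$\begin{bmatrix} \mathbf{I}_N & \cdots & \mathbf{0}_{N\times N} & \mathbf{0}_{N\times N} & \mathbf{0}_{N\times N} \\ \vdots & \ddots & \vdots & \vdots & \vdots \\ \mathbf{0}_{N\times N} & \cdots & \mathbf{I}_N & \mathbf{0}_{N\times N} & \mathbf{0}_{N\times N} \\ \mathbf{I}_N & \cdots & \mathbf{I}_N & \mathbf{I}_N & \mathbf{I}_N \\ \mathbf{A}_1 & \cdots & \mathbf{A}_{k-2} & \mathbf{A}_{k-1} & \mathbf{A}_k \end{bmatrix}. \tag{70}$$

The leftmost $(k-2)N$ columns of the matrix in (70) are linearly independent, due to the upper-left identity block. Moreover, the leftmost $(k-2)N$ columns are linearly independent of the rightmost $2N$, using an analogous argument. Hence, we only need to check the rank of the sub-matrix

$$\begin{bmatrix} \mathbf{I}_N & \mathbf{I}_N \\ \mathbf{A}_{k-1} & \mathbf{A}_k \end{bmatrix}. \tag{71}$$

In the general case, a DC that connects to some $(k-2)$-subset of systematic nodes and the two parities has a corresponding matrix where the following block needs to be full rank so that the MDS property can be satisfied

$$\begin{bmatrix} \mathbf{I}_N & \mathbf{I}_N \\ \mathbf{A}_i & \mathbf{A}_j \end{bmatrix}, \tag{72}$$

for $i, j \in \{1, \ldots, k\}$ and $i \neq j$. The code is MDS when

$$\begin{aligned}
\operatorname{rank}\begin{bmatrix} \mathbf{I}_N & \mathbf{I}_N \\ a_i\mathbf{X}_i + b_i\mathbf{X}_{k+1} + \mathbf{I}_N & a_j\mathbf{X}_j + b_j\mathbf{X}_{k+1} + \mathbf{I}_N \end{bmatrix}
&= \operatorname{rank}\left(\begin{bmatrix} \mathbf{I}_N & \mathbf{I}_N \\ a_i\mathbf{X}_i + b_i\mathbf{X}_{k+1} + \mathbf{I}_N & a_j\mathbf{X}_j + b_j\mathbf{X}_{k+1} + \mathbf{I}_N \end{bmatrix}\begin{bmatrix} \mathbf{I}_N & \mathbf{I}_N \\ \mathbf{0}_{N\times N} & -\mathbf{I}_N \end{bmatrix}\right) \\
&= \operatorname{rank}\begin{bmatrix} \mathbf{I}_N & \mathbf{0}_{N\times N} \\ a_i\mathbf{X}_i + b_i\mathbf{X}_{k+1} + \mathbf{I}_N & a_i\mathbf{X}_i - a_j\mathbf{X}_j + (b_i - b_j)\mathbf{X}_{k+1} \end{bmatrix} \\
&= N + \operatorname{rank}\left(a_i\mathbf{X}_i - a_j\mathbf{X}_j + (b_i - b_j)\mathbf{X}_{k+1}\right) = 2N
\end{aligned} \tag{73}$$

for all $i \neq j \in \{1, \ldots, k\}$, which is true if

$$\operatorname{rank}\left(a_i\mathbf{X}_i - a_j\mathbf{X}_j + (b_i - b_j)\mathbf{X}_{k+1}\right) = N. \tag{74}$$

Since the diagonal elements of $\mathbf{X}_i$ are in $\{\pm 1\}$, every diagonal entry of the matrix in (74) is, up to sign, one of four signed combinations of $a_i$, $a_j$ and $b_i - b_j$, and the previous requirement gives the lemma.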



Lemma 3: The code in (5) is MDS when

i) $a_i - a_j + (b_i - b_j) \neq 0$, (75)
ii) $a_i + a_j - (b_i - b_j) \neq 0$, (76)
iii) $a_i - a_j - (b_i - b_j) \neq 0$, (77)
and iv) $a_i + a_j + (b_i - b_j) \neq 0$, (78)

for all $i \neq j \in \{1, \ldots, k\}$.

Now, remember that our initial constraint on the $a_i$ and $b_i$ constants was

$$a_i^2 - b_i^2 = -1 \Leftrightarrow (a_i - b_i)(a_i + b_i) = -1. \tag{79}$$

One solution to the previous equation is the following

$$a_i - b_i = x_i, \tag{80}$$
$$a_i + b_i = -x_i^{-1}. \tag{81}$$

If we use the above solution of (79), then the MDS conditions (75)-(78) become

$$a_i - a_j + (b_i - b_j) = a_i + b_i - (a_j + b_j) = -x_i^{-1} + x_j^{-1} \neq 0 \Leftrightarrow x_i^{-1} \neq x_j^{-1}, \tag{82}$$
$$a_i + a_j - (b_i - b_j) = a_i - b_i + a_j + b_j = x_i - x_j^{-1} \neq 0 \Leftrightarrow x_i \neq x_j^{-1}, \tag{83}$$
$$a_i - a_j - (b_i - b_j) = a_i - b_i - (a_j - b_j) = x_i - x_j \neq 0 \Leftrightarrow x_i \neq x_j, \tag{84}$$
$$a_i + a_j + (b_i - b_j) = a_i + b_i + a_j - b_j = -x_i^{-1} + x_j \neq 0 \Leftrightarrow x_i^{-1} \neq x_j. \tag{85}$$

The above conditions can be equivalently stated as

$$x_i \neq x_j \quad \text{and} \quad x_i x_j \neq 1, \tag{86}$$

for any $i \neq j \in \{1, \ldots, k\}$.

Then, consider a prime field of size $q$. A set of $x_i$'s satisfies our MDS requirements when its elements are distinct and no two of them are inverses of each other. Over a prime field, the elements $1$ and $q-1$ are their own inverses, and the remaining $q-3$ nonzero elements split into $\frac{q-3}{2}$ pairs of mutual inverses. If we do not consider $x_i \in \{1, q-1\}$ and keep at most one element from each pair, then we are left with $\frac{q-3}{2}$ elements. Therefore, we can consider a prime field of size $q$ that has the property

$$k \le \frac{q-3}{2} \Leftrightarrow q \ge 2k+3 \tag{87}$$

and obtain $x_1, \ldots, x_k$ such that our requirements are satisfied. Then, the elements $a_i$ and $b_i$, for all $i \in \{1, \ldots, k\}$, can be obtained through the following equations

$$a_i = 2^{-1}x_i - 2^{-1}x_i^{-1}, \tag{88}$$
$$b_i = -2^{-1}x_i - 2^{-1}x_i^{-1}. \tag{89}$$

Observe that the above solutions yield $a_i \neq 0$ (which is needed for successful repair), for all $i \in \{1, \ldots, k\}$, when $x_i \notin \{0, 1, q-1\}$. Therefore, a prime field of size greater than or equal to $2k+3$ always suffices to obtain the MDS property.
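
The selection of constants described above can be sketched in a few lines. This is one possible way to pick the $x_i$, assuming a greedy scan over $\mathbb{F}_q$ that skips $0$, $1$, $q-1$ and the inverse of every element already chosen; the function names `mds_constants` and `satisfies_lemma3` are hypothetical, and `pow(x, -1, q)` requires Python 3.8 or later.

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

def mds_constants(k):
    """Pick the smallest prime q >= 2k + 3 and k values x_i with
    x_i != x_j and x_i * x_j != 1, then derive a_i, b_i from (88)-(89)."""
    q = 2 * k + 3
    while not is_prime(q):
        q += 1
    xs, banned = [], {0, 1, q - 1}
    for x in range(2, q - 1):
        if x in banned:
            continue
        xs.append(x)
        banned.update({x, pow(x, -1, q)})   # never reuse x or its inverse
        if len(xs) == k:
            break
    inv2 = pow(2, -1, q)
    a = [inv2 * (x - pow(x, -1, q)) % q for x in xs]
    b = [-inv2 * (x + pow(x, -1, q)) % q for x in xs]
    return q, xs, a, b

def satisfies_lemma3(q, a, b):
    """Check a_i^2 - b_i^2 = -1, a_i != 0, and conditions (75)-(78) modulo q."""
    k = len(a)
    for i in range(k):
        if (a[i] ** 2 - b[i] ** 2 + 1) % q != 0 or a[i] % q == 0:
            return False
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            db = b[i] - b[j]
            conds = [a[i] - a[j] + db, a[i] + a[j] - db,
                     a[i] - a[j] - db, a[i] + a[j] + db]
            if any(c % q == 0 for c in conds):
                return False
    return True

q, xs, a, b = mds_constants(4)
print(q, xs, satisfies_lemma3(q, a, b))
```

For $k = 4$ the smallest admissible prime is $q = 11 = 2k + 3$, and the scan keeps one element from each of the four inverse pairs $\{2,6\}$, $\{3,4\}$, $\{5,9\}$, $\{7,8\}$.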



VIII. Generalizing to more than 2 parities 

A. m-parity codes with optimal systematic repair 

We generalize the Hadamard design construction of Section III and of the code in [26], to construct $(k+m, k)$ MDS storage codes for file sizes $M = km^k$. Our constructions are based on a generalization of the Sylvester construction for complex Hadamard matrices that use $m$-th roots of unity. We generate these matrices as

$$\mathbf{H}_{m^k} = \mathbf{H}_m \otimes \mathbf{H}_{m^{k-1}}, \tag{90}$$

where $\mathbf{H}_m$ is the $m$-point Discrete Fourier Transform matrix over a finite field. For example, for $m = 3$ and $\mathbb{F}_7$, we have

$$\mathbf{H}_3 = \begin{bmatrix} 1 & 1 & 1 \\ 1 & \rho & \rho^2 \\ 1 & \rho^2 & \rho^4 \end{bmatrix}, \tag{91}$$

where $\rho = 2$. Then, we consider the set

$$\mathcal{H}_{m^k} = \left\{ \prod_{i=1}^{k} \mathbf{X}_i^{x_i}\,\mathbf{w} : x_i \in \{0, 1, \ldots, m-1\} \right\}, \tag{92}$$

where $\mathbf{w} = \mathbf{1}_{N\times 1}$, $N = m^k$, and

$$\mathbf{X}_i = \mathbf{I}_{m^{i-1}} \otimes \operatorname{blkdiag}\left(\mathbf{I}_{\frac{N}{m^i}}, \rho\,\mathbf{I}_{\frac{N}{m^i}}, \ldots, \rho^{m-1}\mathbf{I}_{\frac{N}{m^i}}\right). \tag{93}$$

Here, $\rho$ denotes an $m$-th root of unity, which yields

$$\mathbf{X}_i^m = \mathbf{I}_N. \tag{94}$$

As with the $m = 2$ case, there is a one-to-one correspondence between the elements of the set $\mathcal{H}_{m^k}$ and the columns of $\mathbf{H}_{m^k}$. The proof of that property for general $m$ follows the same steps as the $m = 2$ case, thus we omit it.

Remark 5: To maintain the full rank property of $\mathbf{H}_{m^k}$, the finite field over which we operate should be chosen such that all $m$-th roots of unity are distinct. The number of distinct $m$-th roots of unity over a finite field $\mathbb{F}_q$ is given by the number of (distinct) solutions of the equation $x^m = 1$. This is equal to the order of the cyclic group that generates the $m$-th roots of unity within the multiplicative group of $\mathbb{F}_q$. This subgroup has order $m$ when $m$ divides $q - 1$ [27].
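
The condition of Remark 5 can be illustrated with a short sketch for prime $q$. The helper names `root_of_unity` and `dft_matrix` are hypothetical, and the full rank check below relies on the standard fact that the DFT matrix built from $\rho^{-1}$ inverts the one built from $\rho$ up to the factor $m$; it is meant as an illustration for the $m = 3$, $\mathbb{F}_7$, $\rho = 2$ example, not as the paper's construction procedure.

```python
import numpy as np

def root_of_unity(m, q):
    """Return an element of multiplicative order exactly m in F_q (q prime)."""
    if (q - 1) % m:
        raise ValueError("m must divide q - 1")
    for g in range(1, q):
        rho = pow(g, (q - 1) // m, q)
        if all(pow(rho, d, q) != 1 for d in range(1, m)):
            return rho
    raise ValueError("no element of order m found")

def dft_matrix(m, q, rho):
    """m-point DFT matrix over F_q with entries rho^(i*j) mod q."""
    return np.array([[pow(rho, i * j, q) for j in range(m)] for i in range(m)])

def kron_power(base, k, q):
    """Sylvester-style recursion H_{m^k} = H_m kron H_{m^(k-1)}, reduced mod q."""
    H = base
    for _ in range(k - 1):
        H = np.kron(base, H) % q
    return H

q, m, k, rho = 7, 3, 2, 2
H_m = dft_matrix(m, q, rho)
H = kron_power(H_m, k, q)
G = kron_power(dft_matrix(m, q, pow(rho, -1, q)), k, q)

# H @ G equals m^k times the identity mod q, so H is invertible whenever m^k != 0 mod q.
product = (H @ G) % q
print(H_m.tolist(), root_of_unity(m, q))
```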

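1) Code construction: Our $(k+m, k)$ MDS code encodes a file $\mathbf{f}$ of size $M = km^k$ in the manner of

$$\begin{bmatrix} \mathbf{I}_{km^k} \\ \mathbf{A}^{(k,m)} \end{bmatrix}\mathbf{f}, \tag{95}$$

where

$$\mathbf{A}^{(k,m)} = \begin{bmatrix} \mathbf{I}_{m^k} & \mathbf{I}_{m^k} & \cdots & \mathbf{I}_{m^k} \\ \lambda_{1,1}\mathbf{X}_1 & \lambda_{1,2}\mathbf{X}_2 & \cdots & \lambda_{1,k}\mathbf{X}_k \\ \vdots & \vdots & & \vdots \\ \lambda_{m-1,1}\mathbf{X}_1^{m-1} & \lambda_{m-1,2}\mathbf{X}_2^{m-1} & \cdots & \lambda_{m-1,k}\mathbf{X}_k^{m-1} \end{bmatrix}, \tag{96}$$

with $\lambda_{i,j} \in \mathbb{F}_q$.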



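2) Optimal repair of the systematic nodes: For this code, let systematic node $i \in \{1, \ldots, k\}$ fail. Then, to repair it we construct the repair matrix $\mathbf{V}_i$ that has as columns the elements of the set

$$\mathcal{V}_i = \left\{ \prod_{s=1, s \neq i}^{k} \mathbf{X}_s^{x_s}\,\mathbf{w} : x_s \in \{0, 1, \ldots, m-1\} \right\}. \tag{97}$$

This matrix is used to multiply the contents of each of the parity nodes. Here, the useful space during the repair is given by

$$[\mathbf{V}_i \;\; \mathbf{X}_i\mathbf{V}_i \;\; \mathbf{X}_i^2\mathbf{V}_i \;\; \ldots \;\; \mathbf{X}_i^{m-1}\mathbf{V}_i] \tag{98}$$

and the interference space generated by systematic component $j \neq i$ is spanned by

$$[\mathbf{V}_i \;\; \mathbf{X}_j\mathbf{V}_i \;\; \mathbf{X}_j^2\mathbf{V}_i \;\; \ldots \;\; \mathbf{X}_j^{m-1}\mathbf{V}_i]. \tag{99}$$

Due to the modulus-$m$ property of the powers of the $\mathbf{X}_i$ matrices, we obtain the following under the lattice representation

$$\mathcal{L}(\mathbf{V}_i) = \mathcal{L}(\mathbf{X}_j^{l}\mathbf{V}_i) \quad \text{and} \quad \mathcal{L}(\mathbf{X}_i^{l_1}\mathbf{V}_i) \cap \mathcal{L}(\mathbf{X}_i^{l_2}\mathbf{V}_i) = \emptyset, \tag{100}$$

for any $j \in \{1, \ldots, k\}$ with $j \neq i$, and $l, l_1, l_2 \in \{0, \ldots, m-1\}$, with $l_1 \neq l_2$. The above property and the fact that the elements of $\mathcal{H}_{m^k}$ are linearly independent lead us to the following lemma.

Lemma 4: For any $j \in \{1, 2, \ldots, k\}$ we have that

$$\operatorname{rank}\left([\mathbf{V}_i \;\; \mathbf{X}_j\mathbf{V}_i \;\; \mathbf{X}_j^2\mathbf{V}_i \;\; \ldots \;\; \mathbf{X}_j^{m-1}\mathbf{V}_i]\right) = \left|\mathcal{L}(\mathbf{V}_i) \cup \mathcal{L}(\mathbf{X}_j\mathbf{V}_i) \cup \mathcal{L}(\mathbf{X}_j^2\mathbf{V}_i) \cup \ldots \cup \mathcal{L}(\mathbf{X}_j^{m-1}\mathbf{V}_i)\right| \tag{101}$$
$$= \begin{cases} m^{k-1}, & j \neq i, \\ m^k, & j = i. \end{cases} \tag{102}$$

By Lemma 4, we see that each of the $k-1$ interference terms is confined within $m^{k-1}$ dimensions and the full rank property of the useful space is maintained. This is equivalent to stating that we can repair a single systematic node failure by downloading exactly $m^k + (k-1)m^{k-1} = (n-1)m^{k-1}$ equations, which matches exactly the information theoretic repair optimum of [2].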
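
Lemma 4 can be checked directly over a small field. The sketch below assumes the $(6,3)$ setting with $m = 3$, $k = 3$, $\mathbb{F}_7$ and $\rho = 2$, builds $\mathbf{X}_i$ from the block-diagonal definition in (93), and computes ranks with a plain Gaussian elimination modulo $q$; the helper names (`x_matrix`, `rank_mod`, `repair_matrix`, `span_rank`) are hypothetical and the $\lambda_{i,j}$ scalars are omitted because they do not change the spans.

```python
import itertools
import numpy as np

Q, M, K, RHO = 7, 3, 3, 2          # field size, parities, systematic nodes, root of unity
N = M**K

def x_matrix(i):
    """X_i = I_{m^(i-1)} kron blkdiag(I, rho I, ..., rho^(m-1) I), blocks of size N/m^i."""
    size = N // M**i
    block = np.repeat([pow(RHO, t, Q) for t in range(M)], size)
    return np.diag(np.tile(block, M**(i - 1)))

def rank_mod(A, q):
    """Rank of an integer matrix over F_q (q prime) by Gauss-Jordan elimination."""
    A = np.array(A, dtype=np.int64) % q
    rank = 0
    rows, cols = A.shape
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if A[r, c] != 0), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]
        A[rank] = A[rank] * pow(int(A[rank, c]), -1, q) % q
        for r in range(rows):
            if r != rank and A[r, c] != 0:
                A[r] = (A[r] - A[r, c] * A[rank]) % q
        rank += 1
        if rank == rows:
            break
    return rank

X = {s: x_matrix(s) for s in range(1, K + 1)}

def repair_matrix(i):
    """Columns prod_{s != i} X_s^{x_s} w with w the all-ones vector, as in (97)."""
    others = [s for s in range(1, K + 1) if s != i]
    cols = []
    for exps in itertools.product(range(M), repeat=len(others)):
        v = np.ones(N, dtype=np.int64)
        for s, e in zip(others, exps):
            v = (np.linalg.matrix_power(X[s], e) @ v) % Q
        cols.append(v)
    return np.column_stack(cols)

def span_rank(i, j):
    """Rank over F_q of [V_i, X_j V_i, ..., X_j^(m-1) V_i]."""
    V = repair_matrix(i)
    blocks = [(np.linalg.matrix_power(X[j], t) @ V) % Q for t in range(M)]
    return rank_mod(np.hstack(blocks), Q)

useful = span_rank(1, 1)                              # j = i
interference = [span_rank(1, j) for j in range(2, K + 1)]   # j != i
print(useful, interference)
```

With these ranks, the download for repairing node 1 is $m^k + (k-1)m^{k-1}$ equations, in line with the count stated above.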

In Fig. 9 we give an illustration of the repair spaces for a (6,3) code. We sketch the structure of our code on the left of the figure. Each parity block is associated with a specific key matrix $\mathbf{X}_i$. This allows a selection of $\mathbf{V}_i$ that is an invariant subspace to all matrices except the key one, which multiplies the desired and lost file piece. This selection of $\mathbf{V}_i$ results in perfect alignment of interference in $3^2$ dimensions, while ensuring a full rank $3^3$ useful space.

Fig. 9. A (6, 3) systematic-repair optimal code.

3) Suboptimal repair of the parities: In contrast to our 2-parity code, for this $m$-parity code a parity node failure is repaired using the scheme of Wu et al. We first rewrite our code in a new systematic re-interpretation, where the lost parity is now in systematic form, in the same manner as the parity repair of our 2-parity code. During the repair, we align a single interference block by inverting the corresponding matrices. This induces a repair download of $m^{k-1} + (n-2)m^{k}$ equations, which suffices to exactly reconstruct what was lost. This repair strategy is only optimal for $(n, 2)$ codes and asymptotically matches the file size for large $k$.

4) The MDS property: We establish the MDS property of our $m$-parity codes in a probabilistic sense: we show that when we select the $\lambda_{i,j}$ variables uniformly at random over a sufficiently large finite field, then the code is MDS with probability arbitrarily close to 1. This is shown using the Schwartz-Zippel lemma [29], [30] on a nonzero polynomial in the $\lambda_{i,j}$'s induced by the product of all possible DC matrix determinants.

Consider a DC of the code in (95) that connects to $k - p$ systematic nodes and $p$ parities. For simplicity, consider that this is the DC that is connected to the last $k - p$ systematic nodes and the first $p$ parity nodes. The induced determinant of the corresponding DC matrix will be zero if the following determinant is zero

(103)

Since each of the $\mathbf{X}_i$ matrices is diagonal, each column of the matrix in the right hand side of (103) has exactly $p$ nonzero elements. These $pm^k$ columns can be considered to fall into $m^k$ groups, with each element of a group having identical nonzero support with any other vector in that group. Then, any two columns within a block

$$\begin{bmatrix} \lambda_{1,1}\mathbf{X}_1 \\ \lambda_{2,1}\mathbf{X}_1^2 \\ \vdots \end{bmatrix} \tag{104}$$

are orthogonal since their nonzero supports have zero overlap. Hence, a linear dependence will only exist among columns of a given nonzero support. We can then rewrite the matrix determinant of (103) as

(105)

where $\mathbf{P}_r$ and $\mathbf{P}_c$ are the permutation matrices that group the columns and rows of the matrix according to their nonzero support so as to generate the block diagonal matrix. The $p \times p$ matrix $\mathbf{B}_i$ is of the form

$$\mathbf{B}_i = \begin{bmatrix} \rho_{i_1,j_1}\lambda_{i_1,j_1} & \cdots & \rho_{i_1,j_p}\lambda_{i_1,j_p} \\ \vdots & & \vdots \\ \rho_{i_p,j_1}\lambda_{i_p,j_1} & \cdots & \rho_{i_p,j_p}\lambda_{i_p,j_p} \end{bmatrix}, \tag{106}$$

where each $\rho_{i_s,j_t}$ is some root of unity, the indices depend on $i$, and no $\lambda_{i,j}$ appears more than once within each matrix. We can expand the determinant of any $\mathbf{B}_i$ matrix using the Leibniz formula, where $p!$ monomials of degree $p$ appear. Each one of them includes a different subset of the $\lambda_{i,j}$ variables. Hence, the induced polynomial cannot be the zero polynomial. Therefore, the determinant of $\mathbf{B}_i$ is a nonzero polynomial of degree $p$ in the $\lambda_{i,j}$ variables, hence $\prod_{i=1}^{m^k} |\mathbf{B}_i|$ is a nonzero polynomial of degree $pm^k$ in the $\lambda_{i,j}$ variables. Accordingly, we can compute the determinant of each DC in this way. In the same manner, each of them will be a nonzero polynomial in the $\lambda_{i,j}$. The product of all these determinants will be a nonzero polynomial in the $\lambda_{i,j}$ of some degree $d$. By the Schwartz-Zippel lemma [30], we know that when we draw the $\lambda_{i,j}$ uniformly at random over a field of size $q$, this induced polynomial is zero with probability less than or equal to $\frac{d}{q}$. Hence, the MDS property is satisfied with probability arbitrarily close to 1, for sufficiently large finite fields.

IX. Connection to Permutation-Matrix Based Codes 

Here we investigate some interesting connections between our systematic-repair optimal codes of Section VIII and the permutation-matrix based codes presented in [21] and [23]. Under a similarity transformation, our codes are equivalent to ones with coding matrices picked as specific permutation matrices. Multiplying the column space of an $\mathbf{X}_i$ matrix of our construction with the Hadamard matrix $\mathbf{H}_{m^k}$ yields a matrix that is a permutation of the columns of the Hadamard matrix

$$\mathbf{H}_{m^k}^{-1}\mathbf{X}_i\mathbf{H}_{m^k} = \mathbf{H}_{m^k}^{-1}\mathbf{H}_{m^k}\mathbf{P}_i = \mathbf{P}_i, \tag{107}$$

where $\mathbf{P}_i$ is some permutation matrix. This is due to the fact that the elements of $\mathcal{H}_{m^k}$ wrap around, i.e., $\mathcal{L}(\mathbf{H}_{m^k}) = \mathcal{L}(\mathbf{X}_i\mathbf{H}_{m^k})$ for any $i$.



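Example: Consider the $m = 2$, $k = 3$ case:

$$\mathbf{H}_{2^3} = [\mathbf{w} \;\; \mathbf{X}_2\mathbf{w} \;\; \mathbf{X}_1\mathbf{w} \;\; \mathbf{X}_1\mathbf{X}_2\mathbf{w}], \tag{108}$$
$$\mathbf{X}_1\mathbf{H}_{2^3} = [\mathbf{X}_1\mathbf{w} \;\; \mathbf{X}_1\mathbf{X}_2\mathbf{w} \;\; \mathbf{w} \;\; \mathbf{X}_2\mathbf{w}] = \mathbf{H}_{2^3}\mathbf{P}_1, \tag{109}$$
$$\mathbf{X}_2\mathbf{H}_{2^3} = [\mathbf{X}_2\mathbf{w} \;\; \mathbf{w} \;\; \mathbf{X}_1\mathbf{X}_2\mathbf{w} \;\; \mathbf{X}_1\mathbf{w}] = \mathbf{H}_{2^3}\mathbf{P}_2, \tag{110}$$

where $\mathbf{P}_1$ and $\mathbf{P}_2$ are permutation matrices. The wrap-around property of the columns of the Hadamard matrix produces permutations of itself when multiplied by the $\mathbf{X}_i$ matrices, and each permutation is distinct.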



Without loss of generality [16], we can rewrite the $\mathbf{A}^{(k,m)}$ matrix of (95) as

(111)

where $\mathbf{P}_{i,j}$ is a permutation matrix. The systematic nodes of this equivalent $(k+m, k)$ MDS code can be optimally repaired using the repair matrices $\mathbf{V}_i^{T}\mathbf{H}^{-1}$, where $\mathbf{V}_i$ has the columns of the set $\mathcal{V}_i = \left\{ \prod_{s=1, s \neq i}^{k} \mathbf{X}_s^{x_s}\mathbf{w} : x_s \in \{0, 1, \ldots, m-1\} \right\}$. This is true since the rank properties of the corresponding useful and interference spaces remain the same under full rank column transformations. Interestingly, this connection is two-way. We give an example of a permutation code from [23] that exactly maps to our designs.

Example: We consider the (5,3) permutation code of [23], designed for file sizes $M = 3 \cdot 2^3$. The three coding matrices of the first parity of this code are three identity matrices $\mathbf{I}_8$. The three coding matrices of the second parity are three permutation matrices

$$\mathbf{P}_1 = \mathbf{I}_{\{5,6,7,8,1,2,3,4\},:}, \quad \mathbf{P}_2 = \mathbf{I}_{\{3,4,1,2,7,8,5,6\},:}, \quad \text{and} \quad \mathbf{P}_3 = \mathbf{I}_{\{2,1,4,3,6,5,8,7\},:}, \tag{112}$$

where $\mathbf{I}_{\{i_1,i_2,i_3,i_4,i_5,i_6,i_7,i_8\},:}$ indicates a permutation of the columns of the $8 \times 8$ identity matrix. We know that these matrices commute; therefore, since they are normal, they can be simultaneously diagonalized under a common eigenbasis. It can be checked that a common basis for the above commuting permutation matrices is the Hadamard matrix, which gives

$$\mathbf{H}_8\mathbf{P}_1\mathbf{H}_8^{-1} = \mathbf{X}_1, \quad \mathbf{H}_8\mathbf{P}_2\mathbf{H}_8^{-1} = \mathbf{X}_2, \quad \mathbf{H}_8\mathbf{P}_3\mathbf{H}_8^{-1} = \mathbf{X}_3. \tag{113}$$
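
The diagonalization claimed in this example is easy to confirm numerically. The sketch below assumes the $8 \times 8$ Sylvester Hadamard matrix and integer arithmetic, uses $\mathbf{H}_8^{-1} = \frac{1}{8}\mathbf{H}_8$, and takes the index lists of (112) as row orderings of the identity (for these symmetric permutations, rows and columns give the same matrices); `perm` and `x_matrix` are hypothetical helper names.

```python
import numpy as np

H2 = np.array([[1, 1], [1, -1]])
H8 = np.kron(H2, np.kron(H2, H2))          # Sylvester construction of H_8

def perm(order):
    """Identity matrix with rows taken in the given 1-indexed order, as in (112)."""
    return np.eye(8, dtype=int)[[r - 1 for r in order], :]

def x_matrix(i, N=8):
    """X_i = I_{2^(i-1)} kron blkdiag(I_{N/2^i}, -I_{N/2^i})."""
    half = N // 2**i
    block = np.concatenate([np.ones(half, dtype=int), -np.ones(half, dtype=int)])
    return np.diag(np.tile(block, 2**(i - 1)))

P = [perm([5, 6, 7, 8, 1, 2, 3, 4]),
     perm([3, 4, 1, 2, 7, 8, 5, 6]),
     perm([2, 1, 4, 3, 6, 5, 8, 7])]

# H8 @ P_i @ H8 equals 8 * X_i, so dividing by 8 realizes H8 P_i H8^{-1}.
diagonalized = [(H8 @ Pi @ H8) // 8 for Pi in P]
matches = [bool(np.array_equal(D, x_matrix(i + 1))) for i, D in enumerate(diagonalized)]
commute = all(np.array_equal(P[i] @ P[j], P[j] @ P[i])
              for i in range(3) for j in range(3))
print(matches, commute)
```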

The connection manifested by the above equivalence examples seems very interesting. We believe that further investigation 
on it can lead to better understanding of the repair optimal high-rate MDS code regime. 

X. Conclusions

We presented the first explicit, high-rate, $(k+2, k)$ erasure MDS storage code that achieves optimal repair bandwidth for any single node failure, including the parities. Our construction is based on perfect interference alignment properties offered by Hadamard designs. We generalize our 2-parity constructions to erasure codes with $m$ parities that achieve optimal repair of the systematic parts.

Appendix

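Proof of Lemma 1: Observe that $\mathbf{H}_N^T = \mathbf{H}_N$ and

$$\begin{aligned}
\mathbf{H}_N\mathbf{H}_N^T &= 2\left(\mathbf{I}_2 \otimes \mathbf{H}_{\frac{N}{2}}\mathbf{H}_{\frac{N}{2}}^T\right) \\
&= 2\left(\mathbf{I}_2 \otimes 2\left(\mathbf{I}_2 \otimes \mathbf{H}_{\frac{N}{2^2}}\mathbf{H}_{\frac{N}{2^2}}^T\right)\right) \\
&= 4\left(\mathbf{I}_4 \otimes \mathbf{H}_{\frac{N}{2^2}}\mathbf{H}_{\frac{N}{2^2}}^T\right) \\
&= \ldots = N \cdot (\mathbf{I}_N \otimes \mathbf{H}_1\mathbf{H}_1) = N \cdot \mathbf{I}_N.
\end{aligned} \tag{114}$$

We also have that $N \neq 0 \pmod{q}$, for $q > 2$, thus the rank of $\mathbf{H}_N$ is $N$ and its columns are mutually orthogonal. □

Then, let an $N \times N$ diagonal matrix

$$\mathbf{X}_i = \mathbf{I}_{2^{i-1}} \otimes \operatorname{blkdiag}\left(\mathbf{I}_{\frac{N}{2^i}}, -\mathbf{I}_{\frac{N}{2^i}}\right) \tag{115}$$

be defined for $i \in \{1, \ldots, \log_2(N)\}$. $\mathbf{X}_i$ is a diagonal matrix whose elements are a series of alternating $1$s and $-1$s, starting with $\frac{N}{2^i}$ $1$s that flip to $-1$s and back every $\frac{N}{2^i}$ positions. We can now expand $\mathbf{H}_N$ in the following way

$$\mathbf{H}_N = \left[\mathbf{1}_{2\times 1} \otimes \mathbf{H}_{\frac{N}{2}} \;\;\; \mathbf{X}_1\left(\mathbf{1}_{2\times 1} \otimes \mathbf{H}_{\frac{N}{2}}\right)\right] = [\mathbf{F}_1 \;\; \mathbf{X}_1\mathbf{F}_1]. \tag{116}$$

We proceed in the same manner by expanding all "smaller" $\mathbf{H}_N$s

$$\begin{aligned}
\mathbf{F}_1 &= \mathbf{1}_{2\times 1} \otimes \mathbf{H}_{\frac{N}{2}} = \left[\mathbf{1}_{2^2\times 1} \otimes \mathbf{H}_{\frac{N}{2^2}} \;\;\; \mathbf{X}_2\left(\mathbf{1}_{2^2\times 1} \otimes \mathbf{H}_{\frac{N}{2^2}}\right)\right], \\
\mathbf{F}_2 &= \mathbf{1}_{2^2\times 1} \otimes \mathbf{H}_{\frac{N}{2^2}} = \left[\mathbf{1}_{2^3\times 1} \otimes \mathbf{H}_{\frac{N}{2^3}} \;\;\; \mathbf{X}_3\left(\mathbf{1}_{2^3\times 1} \otimes \mathbf{H}_{\frac{N}{2^3}}\right)\right], \\
&\;\;\vdots \\
\mathbf{F}_{\log_2(N)-1} &= \left[\mathbf{1}_{N\times 1} \;\;\; \mathbf{X}_{\log_2(N)}\mathbf{1}_{N\times 1}\right],
\end{aligned} \tag{117}$$

where $\mathbf{F}_i$ is an $N \times \frac{N}{2^i}$ matrix. Thus,

$$\operatorname{span}(\mathbf{H}_N) = \operatorname{span}([\mathbf{F}_1 \;\; \mathbf{X}_1\mathbf{F}_1]) = \operatorname{span}([\mathbf{F}_2 \;\; \mathbf{X}_2\mathbf{F}_2 \;\; \mathbf{X}_1\mathbf{F}_2 \;\; \mathbf{X}_1\mathbf{X}_2\mathbf{F}_2]) = \ldots = \operatorname{span}\left(\left\{ \prod_{i=1}^{\log_2(N)} \mathbf{X}_i^{x_i}\mathbf{w} : x_i \in \{0,1\} \right\}\right),$$

which proves the final part of Lemma 1.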
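
Both parts of Lemma 1 can be confirmed numerically for a small $N$. The sketch below assumes integer arithmetic and the Sylvester recursion $\mathbf{H}_N = \mathbf{H}_2 \otimes \mathbf{H}_{N/2}$, and compares the columns of $\mathbf{H}_N$ with the vectors $\prod_i \mathbf{X}_i^{x_i}\mathbf{w}$ as sets; the function names are hypothetical.

```python
import itertools
import numpy as np

def sylvester(N):
    """Sylvester Hadamard matrix H_N = H_2 kron H_{N/2}, for N a power of two."""
    H = np.array([[1]])
    while H.shape[0] < N:
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H

def x_diagonal(i, N):
    """Diagonal of X_i: runs of N/2^i ones and minus ones, repeated 2^(i-1) times."""
    half = N // 2**i
    block = np.concatenate([np.ones(half, dtype=int), -np.ones(half, dtype=int)])
    return np.tile(block, 2**(i - 1))

def lattice_vectors(N):
    """All vectors prod_i X_i^{x_i} w with x_i in {0, 1} and w the all-ones vector."""
    n = N.bit_length() - 1                 # log2(N)
    vectors = set()
    for bits in itertools.product([0, 1], repeat=n):
        v = np.ones(N, dtype=int)
        for i, bit in enumerate(bits, start=1):
            if bit:
                v = v * x_diagonal(i, N)   # X_i is diagonal, so this is X_i @ v
        vectors.add(tuple(v.tolist()))
    return vectors

N = 16
H = sylvester(N)
gram_is_scaled_identity = bool(np.array_equal(H @ H.T, N * np.eye(N, dtype=int)))
columns = {tuple(col.tolist()) for col in H.T}
same_set = columns == lattice_vectors(N)
print(gram_is_scaled_identity, same_set)
```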